Why use R for geospatial analyses?

Since the early 2000s, an active community of R developers has built a wide variety of packages that enable R to interface with geographic data. The extent of R's geographic capabilities is readily apparent from the many packages listed in the CRAN task view for spatial data. Although many other tools exist to visualize geographic information - from full-scale GIS applications such as ArcGIS and QGIS to web-based tools like Google Maps - using R to visualize and analyze geographic data has the benefit of enabling approaches that are customizable, transparent, and reproducible. Moreover, RStudio has made R more user-friendly for many, facilitating map-making with a panel dedicated to interactive visualization. Finally, and perhaps most importantly, many traditional GIS tools offer limited functionality for spatial statistics and analysis. With only a small range of built-in functions, these tools often force users to move between several platforms for advanced analysis, modeling, prediction, and simulation. Given the power of R as a statistical and visualization tool, processing geospatial data directly within R lets users proceed to complex analyses and models within the same environment and even the same session.

A quick way to illustrate R’s flexibility and evolving geographic capabilities is interactive map making. As we’ll see later in this workshop, the statement that R has “limited interactive [plotting] facilities” (Bivand et al. 2013) is no longer true. Let’s have a look at just how easy it is to generate a beautiful interactive map in R, without even having to import any data:

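library(leaflet)

# Start an empty map widget, centre it, then add a satellite imagery basemap
leaflet() %>%
  setView(-119.6081, 37.25281, zoom = 4) %>%  # longitude first, then latitude
  addProviderTiles("Esri.WorldImagery")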

Here’s another crowd-pleaser - NASA’s Earth at Night tile:

library(leaflet)

leaflet() %>% addProviderTiles("NASAGIBS.ViirsEarthAtNight2012")

Challenge

By modifying the code above, try out 5 other Leaflet Provider Tiles and zoom in to your home town (hint: you’ll need to know the coordinates)! You can find a list of all the provider tiles available in Leaflet here.
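As a starting point for the challenge, the provider names can also be browsed from within R. The following is a minimal sketch, assuming the installed version of leaflet exposes its bundled `providers` list (a named list of tile provider identifiers); the `"^Esri"` filter and the object names `provider_names`, `esri_tiles`, and `m` are illustrative choices only, and the coordinates are simply the ones used in the first example.

```r
library(leaflet)

# Names of the tile providers bundled with the leaflet package
provider_names <- names(leaflet::providers)
head(provider_names)

# Narrow the list down, e.g. to the Esri family of basemaps
esri_tiles <- grep("^Esri", provider_names, value = TRUE)
esri_tiles

# Build a map with one of the listed providers at a closer zoom level;
# swap in your own longitude/latitude to centre it on another place
m <- leaflet() %>%
  setView(lng = -119.6081, lat = 37.25281, zoom = 10) %>%
  addProviderTiles("Esri.WorldImagery")
m
```

Any name returned in `provider_names` can be passed to `addProviderTiles()`, although some providers may require an API key or may no longer serve tiles.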

R’s spatial ecosystem: an emerging geoverse?

Like many areas of software development, R’s spatial ecosystem is rapidly evolving. Because R is open source, these developments can easily build on previous work, by ‘standing on the shoulders of giants’, as Isaac Newton put it in 1675. This approach is advantageous because it encourages collaboration and avoids ‘reinventing the wheel’. The package sf, for example, builds on its predecessor sp.

A surge of development effort and interest in ‘R-spatial’ followed the R Consortium’s award of a grant to develop support for Simple Features, an open standard and model for storing and accessing vector geometries. This resulted in the sf package. The immense interest in sf is visible in many places, especially the R-sig-Geo Archives, a long-standing open-access email list containing much R-spatial wisdom accumulated over the years.
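To make the idea of Simple Features more concrete, here is a small sketch of how a vector geometry can be built by hand with sf. It assumes the sf package is installed; the column name `label`, its value, and the object names are hypothetical, and the coordinates are reused from the leaflet example above purely for illustration.

```r
library(sf)

# A single point geometry, given as (longitude, latitude)
pt <- st_point(c(-119.6081, 37.25281))

# A geometry column holding that point, tagged with a coordinate
# reference system (EPSG:4326 is WGS 84 longitude/latitude)
geom <- st_sfc(pt, crs = 4326)

# An sf object: an ordinary data frame plus a geometry column
site <- st_sf(label = "example site", geometry = geom)

st_as_text(geom)  # the geometry in well-known text (WKT) form
class(site)       # the object is both an sf object and a data frame
```

Because an sf object is a data frame with one extra list-column for the geometry, most familiar data frame operations continue to work on it.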

The popularity of spatial packages in R. The y-axis shows average number of downloads per day, within a 30-day rolling window, of prominent spatial packages.


It is noteworthy that shifts in the wider R community, as exemplified by the data processing package dplyr (released in 2014), influenced shifts in R’s spatial ecosystem. Alongside other packages that share a style and an emphasis on ‘tidy data’ (including, e.g., ggplot2), dplyr was placed in the tidyverse ‘metapackage’ in late 2016. The tidyverse approach, with its focus on long-form data and fast, intuitively named functions, has become immensely popular. This has led to a demand for ‘tidy geographic data’, which has been partly met by sf, which we will cover in this workshop, as well as by other emerging approaches such as tabularaster, which we will not cover here. An obvious feature of the tidyverse is the tendency for packages to work in harmony. There is no equivalent geoverse, but there are attempts at harmonization between packages hosted in the r-spatial organization, and a growing number of packages use sf.
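What ‘tidy geographic data’ can look like in practice is sketched below. This is an illustration under stated assumptions rather than workshop material: it uses the North Carolina counties shapefile that ships with sf (not a dataset from this workshop), and the 0.2 threshold on the `AREA` column is arbitrary.

```r
library(sf)
library(dplyr)

# Read the example shapefile bundled with the sf package
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Ordinary dplyr verbs work on sf objects; the geometry column is
# "sticky" and is kept even though select() does not name it
large_counties <- nc %>%
  filter(AREA > 0.2) %>%
  select(NAME, AREA) %>%
  arrange(desc(AREA))

class(large_counties)  # still an sf object
names(large_counties)  # the selected columns plus the geometry column
```

The result can be passed straight to plotting or further spatial operations, since it never stopped being a spatial object.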

An overview of R’s spatial ecosystem can be found in the CRAN Task View on the Analysis of Spatial Data (see https://cran.r-project.org/web/views/Spatial.html).

Resources

This workshop is based on the incredibly comprehensive and user-friendly book Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow.

Other useful resources for learning about GIS in R are: